Voice Disorder Classification Based on Multitaper Mel Frequency Cepstral Coefficients Features
نویسندگان
چکیده
The Mel Frequency Cepstral Coefficients (MFCCs) are widely used in order to extract essential information from a voice signal and became a popular feature extractor used in audio processing. However, MFCC features are usually calculated from a single window (taper) characterized by large variance. This study shows investigations on reducing variance for the classification of two different voice qualities (normal voice and disordered voice) using multitaper MFCC features. We also compare their performance by newly proposed windowing techniques and conventional single-taper technique. The results demonstrate that adapted weighted Thomson multitaper method could distinguish between normal voice and disordered voice better than the results done by the conventional single-taper (Hamming window) technique and two newly proposed windowing methods. The multitaper MFCC features may be helpful in identifying voices at risk for a real pathology that has to be proven later.
منابع مشابه
Voice-based Age and Gender Recognition using Training Generative Sparse Model
Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...
متن کاملMultitaper MFCC Features for Acoustic Stress Recognition from Speech
Ameliorating the performances of speech recognition system is a challenging problem interesting recent researchers. In this paper, we compare two extraction methods of Mel Frequency Cepstral Coefficients used to represent stressed speech utterances in order to obtain best performances. The first method known as traditional is based on single window (taper) generally the Hamming window and the s...
متن کاملCombining amplitude and phase-based features for speaker verification with short duration utterances
Due to the increasing use of fusion in speaker recognition systems, one trend of current research activity focuses on new features that capture complementary information to the MFCC (Mel-frequency cepstral coefficients) for improving speaker recognition performance. The goal of this work is to combine (or fuse) amplitude and phase-based features to improve speaker verification performance. Base...
متن کاملCombining evidences from mel cepstral, cochlear filter cepstral and instantaneous frequency features for detection of natural vs. spoofed speech
Speech synthesis and voice conversion techniques can pose threats to current speaker verification (SV) systems. For this purpose, it is essential to develop front end systems that are able to distinguish human speech vs. spoofed speech (synthesized or voice converted). In this paper, for the ASVspoof 2015 challenge, we propose a detector based on combination of cochlear filter cepstral coeffici...
متن کاملExploring Kernels in Svm-based Classification of Larynx Pathology from Human Voice
In this paper identification of laryngeal disorders using cepstral parameters of human voice is investigated. Mel-frequency cepstral coefficients (MFCC), extracted from audio recordings, are further approximated, using 3 strategies: sampling, averaging, and estimation. SVM and LS-SVM categorize preprocessed data into normal, nodular, and diffuse classes. Since it is a three-class problem, vario...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
دوره 2015 شماره
صفحات -
تاریخ انتشار 2015